The graphs below don’t have proper titles, axis labels, legends, etc. Please take care to do this on your own graphs.
We can create static maps (road maps, satellite view maps, hybrid maps) with the ggmap package:
library(tidyverse)
library(ggmap)
map_base <- get_map(location = c(lon = -79.944248, lat = 40.4415861),
color = "color",
source = "google",
maptype = "road",
zoom = 6)
map_object <- ggmap(map_base,
extent = "device",
ylab = "Latitude",
xlab = "Longitude")
map_object
Here, we’ll work with airline data from this GitHub repository.
Max Marchi wrote a great summary of using maps in R with ggmap here. We’ll follow this example closely in today’s lecture. Thanks, Max!
Before we begin, note that this is just one example of how you can add interesting information to maps with ggmap. As long as you have latitude and longitude information, you should be able to add data to maps. For more interesting examples and for an in-depth description of ggmap, see the short paper by David Kahle and Hadley Wickham here.
To load data from GitHub, you should navigate to the raw file, copy the URL, and use read.csv().
# Load and format airports data
airports <- read_csv("https://raw.githubusercontent.com/jpatokal/openflights/master/data/airports.dat",
col_names = c("ID", "name", "city", "country", "IATA_FAA",
"ICAO", "lat", "lon", "altitude", "timezone", "DST"))
airports
## # A tibble: 7,184 × 11
## ID name city
## <int> <chr> <chr>
## 1 1 Goroka Airport Goroka
## 2 2 Madang Airport Madang
## 3 3 Mount Hagen Kagamuga Airport Mount Hagen
## 4 4 Nadzab Airport Nadzab
## 5 5 Port Moresby Jacksons International Airport Port Moresby
## 6 6 Wewak International Airport Wewak
## 7 7 Narsarsuaq Airport Narssarssuaq
## 8 8 Godthaab / Nuuk Airport Godthaab
## 9 9 Kangerlussuaq Airport Sondrestrom
## 10 10 Thule Air Base Thule
## # ... with 7,174 more rows, and 8 more variables: country <chr>,
## # IATA_FAA <chr>, ICAO <chr>, lat <dbl>, lon <dbl>, altitude <int>,
## # timezone <dbl>, DST <chr>
# Load and format routes data
routes <- read_csv("https://raw.githubusercontent.com/jpatokal/openflights/master/data/routes.dat",
col_names = c("airline", "airlineID", "sourceAirport",
"sourceAirportID", "destinationAirport",
"destinationAirportID", "codeshare", "stops",
"equipment"))
routes
## # A tibble: 67,663 × 9
## airline airlineID sourceAirport sourceAirportID destinationAirport
## <chr> <chr> <chr> <chr> <chr>
## 1 2B 410 AER 2965 KZN
## 2 2B 410 ASF 2966 KZN
## 3 2B 410 ASF 2966 MRV
## 4 2B 410 CEK 2968 KZN
## 5 2B 410 CEK 2968 OVB
## 6 2B 410 DME 4029 KZN
## 7 2B 410 DME 4029 NBC
## 8 2B 410 DME 4029 TGK
## 9 2B 410 DME 4029 UUA
## 10 2B 410 EGO 6156 KGD
## # ... with 67,653 more rows, and 4 more variables:
## # destinationAirportID <chr>, codeshare <chr>, stops <int>,
## # equipment <chr>
Here, we’ll do some data manipulation to obtain the number of arrivals/departures per airport.
# Manipulate the routes data to create two new data.frames
# one for arrivals, one for departures.
# Each counts the number of flights arriving/departing from each airport.
departures <- routes %>%
group_by(sourceAirportID) %>%
summarize(flights = n()) %>%
mutate(sourceAirportID = as.integer(as.vector(sourceAirportID)))
arrivals <- routes %>%
group_by(destinationAirportID) %>%
summarize(flights = n()) %>%
mutate(destinationAirportID = as.integer(as.vector(destinationAirportID)))
# Merge each of the arrivals/departures data.frames with the airports data.frame above
airportD <- left_join(airports, departures, by = c("ID" = "sourceAirportID"))
airportA <- left_join(airports, arrivals, by = c("ID" = "destinationAirportID"))
# Create data.frame of routes to/from a specific city
my_airport_code <- "PIT"
my_routes <- filter(routes,
sourceAirport == my_airport_code |
destinationAirport == my_airport_code)
# Add in relevant information from the airports data.frame
# Do this in two steps, so that you first pull in the source airport information
# and then pull in the destination airport information
# This can be done easily with two calls to left_join()
# The rest of the code is just formatting
my_airport_data <- my_routes %>%
left_join(airports, by = c("sourceAirport" = "IATA_FAA")) %>%
select(destinationAirport, lat, lon, timezone) %>%
rename(source_lat = lat, source_lon = lon, source_timezone = timezone) %>%
left_join(airports, by = c("destinationAirport" = "IATA_FAA")) %>%
select(source_lat, source_lon, source_timezone, lat, lon, timezone) %>%
rename(dest_lat = lat, dest_lon = lon, dest_timezone = timezone)
We’ll use ggmap to create a base map of the data. Now, we’ll add layers of points onto the map.
# Use ggmap to visualize the European flights
library(ggmap)
map <- get_map(location = 'Europe', zoom = 4)
map <- get_map(location = 'United States', zoom = 4)
# Visualize the basic map
ggmap(map)
# Add points to the map of departures
# Each point will be located at the lat/long of the airport
# The size of the points is proportional to the square root of the number of flights at that airport
mapPoints <- ggmap(map) +
geom_point(aes(x = lon, y = lat, size = sqrt(flights), color = DST),
data = airportD, alpha = .5)
# Add a custom legend to the plot
mapPointsLegend <- mapPoints +
scale_size_area(breaks = sqrt(c(1, 5, 10, 50, 100, 500)),
labels = c(1, 5, 10, 50, 100, 500),
name = "departing routes")
mapPointsLegend
Below, we create a new variable (flight type – arrival or departure) and use it to create a single data.frame of all arrivals and departures.
We’ll then facet on this variable so we can simultaneously visualize both arriving and departing flights.
# Create a data.frame containing both departures and arrivals
# Do this by creating a "type" variable and then combining the two data.frames
# We will later be able to use the type variable for facetting
airportD$type <- "departures"
airportA$type <- "arrivals"
airportDA <- bind_rows(airportD, airportA)
# Create our base plot and add the custom legend
mapPointsDA <- ggmap(map) +
geom_point(aes(x = lon, y = lat, size = sqrt(flights)),
data = airportDA, alpha = .5) +
scale_size_area(breaks = sqrt(c(1, 5, 10, 50, 100, 500)),
labels = c(1, 5, 10, 50, 100, 500),
name = "routes")
# Facet on flight type (arrival/departure)
mapPointsDA + facet_grid(. ~ type)
ggmapgc <- geocode("carnegie mellon university")
map <- get_map(gc, zoom = 6)
ggmap(map) +
geom_point(aes(x = lon, y = lat),
data = gc, size = 5, color = "red")
What is a polygon?
geom_polygon()In ggplot(), polygons are just another geometry, making it really easy to add geographic shapes (e.g. corresponding to countries, states, counties, etc.) to maps.
# Note: The sp package can be really fussy at installation
# If prompted, do not restart R when installing the package
# If prompted with "Do you want to install from sources the packages which need compilation?", type 'n'
#install.packages("sp", dependencies = T)
library(sp)
library(ggmap)
us_data <- map_data("state")
county_data <- map_data("county")
us_county_map <- ggplot() +
geom_polygon(aes(long, lat, group=group), fill = "blue", size = 4,
data = county_data) +
geom_polygon(aes(long, lat, group=group), color = 'white',
fill = NA, data = us_data) +
theme_bw() + theme(axis.text = element_blank(),
axis.title = element_blank())
us_county_map
coord_map() to specify your map projectionWay back at the beginning of the semester, we learned about using coord_cartesian() and coord_polar() to specify the coordinates of our plots. We’re (finally) revisiting this idea in the context of geographic mapping.
Using the coord_map() function, we can specify what kind of map projection we want to use.
us_county_map + coord_map("mercator")
us_county_map + coord_map("polyconic")
us_county_map + coord_map("sinusoidal")